A Translation Framework for Executing the Sequential Binary Code on CPU/GPU Based Architectures
Authors
Abstract
Dynamic binary translation (DBT), which executes binary code compiled for a source ISA on a target platform, has been hampered by high overhead for many years. The GPU, as a many-core processor, has tremendous computational power. Employing the GPU as a coprocessor to execute the hot spots of binary code in parallel holds great promise for substantially reducing the overhead of DBT. This paper presents a novel translation framework for constructing a virtual execution environment that accelerates DBT on CPU/GPU-based architectures. Given the parallelizable parts (hot spots) of the binary code and their related information, the framework converts the sequential code into PTX form and executes it on GPUs. Under the framework, the source code does not need to be rewritten, and binary compatibility issues between different GPUs are also resolved. Experimental results on several programs from the CUDA SDK Code Samples and the Parboil Benchmark Suite show that the framework significantly improves performance, typically achieving a 10X speedup on average compared to native x86 platforms. The speedup grows further as the input size increases.
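The abstract does not say how hot spots are identified or how the translated code is launched, so the following is only a minimal sketch of the general idea, under stated assumptions. It counts executions of a code block and, once a hypothetical threshold is exceeded, switches from a sequential loop to a per-index "kernel" form, which is the shape a loop takes when each iteration is mapped to one GPU thread. The names `HOT_THRESHOLD`, `Dispatcher`, `saxpy_kernel` and the threshold policy are illustrative assumptions, not the paper's design, and the GPU path is simulated on the CPU.

```python
from collections import Counter

# Hypothetical policy: a block is "hot" once it has run more than this many times.
HOT_THRESHOLD = 3


def saxpy_sequential(a, xs, ys):
    """Sequential loop, standing in for the ordinary translated CPU path."""
    out = []
    for i in range(len(xs)):
        out.append(a * xs[i] + ys[i])
    return out


def saxpy_kernel(i, a, xs, ys):
    """Body of a single iteration, written as one GPU thread would execute it."""
    return a * xs[i] + ys[i]


def saxpy_offloaded(a, xs, ys):
    """Stand-in for the GPU path: one independent kernel call per index.

    On real hardware these calls would run concurrently; here they are
    evaluated one after another purely to show the data-parallel structure.
    """
    return [saxpy_kernel(i, a, xs, ys) for i in range(len(xs))]


class Dispatcher:
    """Counts block executions and picks the CPU or the (simulated) GPU path."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()

    def run(self, block_id, a, xs, ys):
        self.counts[block_id] += 1
        if self.counts[block_id] > self.threshold:
            return "gpu", saxpy_offloaded(a, xs, ys)
        return "cpu", saxpy_sequential(a, xs, ys)
```

Both paths compute the same result; only the structure differs. The per-index form is valid here because no iteration reads a value written by another, which is the property that makes a loop a candidate for offloading in the first place.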
Similar resources
Unleashing the Potential Impact of Nonessential Self-contained Software Units and Flexible Precedence Relations upon the Value of Software
MCUDA: An Efficient Implementation of CUDA Kernels on Multi-cores
The CUDA programming model, which is based on an extended ANSI C language and a runtime environment, allows the programmer to specify explicitly data parallel computation. NVIDIA developed CUDA to open the architecture of their graphics accelerators to more general applications, but did not provide an efficient mapping to execute the programming model on any other architecture. This document de...
Special Issue on Parallel and Distributed Data Processing
As an exponentially increasing amount of data is generated everyday, parallel and distributed data processing has emerged as a key enabling technology and plays more and more important roles in modern computing and information technologies. In recent years, many researchers have become increasingly interested in the field of parallel and distributed data processing, and a large number of remark...
The Design and Implementation of Ocelot’s Dynamic Binary Translator from PTX to Multi-Core x86
Ocelot is a dynamic compilation framework designed to map the explicitly parallel PTX execution model used by NVIDIA CUDA applications onto diverse many-core architectures. Ocelot includes a dynamic binary translator from PTX to many-core processors that leverages the LLVM code generator to target x86. The binary translator is able to execute CUDA applications without recompilation and Ocelot c...
Method and apparatus for determining branch addresses in programs generated by binary translation
Binary translation allows to maintain compatibility across different architectures while still executing at native speeds. To this end, the original program is treated as input to a binary translator which analyzes the program and generates equivalent code for the current base architecture. Binary translation can either occur as a separate step prior to program execution, also referred to as 's...
Journal title:
- JSW
Volume 6, Issue
Pages -
Publication date: 2011